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The Rugged Metropolis (RM) algorithm is a biased updating scheme, which aims at directly hitting 
the most likely configurations in a rugged free energy landscape. Details of the one- variable (RMi) 
implementation of this algorithm are presented. This is followed by an extension to simultaneous 
updating of two dynamical variables (RM2). In a test with Met-Enkephalin in vacuum RM2 improves 
conventional Metropolis simulations by a factor of about four. Correlations between three or more 
dihedral angles appear to prevent larger improvements at low temperatures. We also investigate a 
multi-hit Metropolis scheme, which spends more CPU time on variables with large autocorrelation 
times. 



PACS: 05.10.Ln, 87.15-v, 87.14.Ee. 

I. INTRODUCTION 

Simulations of biomolecules are one of the major chal- 
lenges in computational science. Rugged free energy 
landscapes are typical for such systems. In this context 
a Rugged Metropolis (RM) algorithm was introduced in 
Ref. [1]. The motivation of RM was an elaboration of the 
funnel picture of protein folding, which was originally for- 
mulated by Bryngelson and Wolynes [2]. RM uses a bi- 
ased Metropolis algorithm, with the bias of the updating 
proposal obtained using data from previous simulations 
at higher temperatures. Although the possibility of con- 
structing biased Metropolis algorithms has been known 
for many years [3] , and these have occasionally been used 
in the statistical physics [4,5] and bio-chemical [6-8] lit- 
erature, it seems that a systematic understanding of the 
possibilities of biased Metropolis procedures is still in its 
infancy. For instance, it was only recently noted [9] that 
biased one- variable updates allow one to imitate the heat 
bath (Gibbs sampler) updates and can still be efficient 
when the conventional calculation of heat bath probabil- 
ities becomes prohibitively slow. 

The RM approach is distinct from generalized ensem- 
ble simulations. Generalized ensembles (for reviews and 
recent work see [10]) also use information from higher 
temperatures, but in an entirely different way. In a 
sense generalized ensembles build bridges in a rugged 
free energy landscape, while the RM scheme aims to en- 
hance the likelihood for a direct hit of the needle in the 
haystack. In fact RM updates can be implemented within 
any generalized ensemble. In a test case of RM updates 
within a replica exchange simulation, the improvement 
was multiplicative [1]. 

The main technical challenge within the RM scheme is 
to obtain, from available time series data, estimates of the 
multi-variable probability densities (pds) in a form that 
allows for their fast numerical evaluation. So far this was 
only achieved for one-variable pds, resulting in the RMi 



update scheme. However, it is well-known that many de- 
grees of freedom in a protein molecule are coupled. In 
addition, one needs multi-variable moves to avoid steric 
clashes [11] (cf. [8] and references therein for more recent 
literature). As a next approximation to the desired RM 
probabilities, in this paper we deal with pds of two vari- 
ables to develop and test the corresponding RM 2 update 
scheme. 

In the present paper all our Monte Carlo (MC) simu- 
lations are done in the canonical ensemble for the brain 
peptide Met-Enkephalin in vacuum, which has been a 
frequently used test case since its initial numerical in- 
vestigation in Ref. [12]. For this (artificial) system the 
coil-globule transition temperature is at Tg « 295 K and 
the folding temperature is at Tf w 230 K according to 
Ref. [13]. Long living traps are found at the glass tran- 
sition temperature, which is for Met-Enkephalin below 
the folding temperature at T g sa 180 K [14]. In our 
simulations we cover a range from 400 K down to 220 K 
and measure integrated autocorrelation times (see, e.g., 
Ref. [15] for the definition) to determine the performance 
of our algorithms. 

In section II we review the RM scheme and its RMi ap- 
proximation, filling in many details which inevitably had 
to be omitted in the letter format of Ref. [1]. On the fly 
we also investigate a multi-hit updating procedure, which 
spends more computer time on variables with large inte- 
grated autocorrelation times. In section III we introduce 
and test a RM2 scheme. Summary and conclusions follow 
in section IV. 



II. RM AND THE RMi APPROXIMATION 

We consider biomolecule models for which the en- 
ergy E is a function of a number of dynamical vari- 
ables Vi, i = 1, ...,n. The fluctuations in the Gibbs 
canonical ensemble are described by a probability den- 
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sity (pd) p(vi, . . . ,v n ;T), where T is the temperature. 
To be definite, we use in the following the all-atom en- 
ergy function ECEPP/2 (Empirical Conformational En- 
ergy Program for Peptides) [16]. Our dynamical variables 
Vi are the dihedral angles, each chosen to be in the range 
— 7r < Vi < 7r, so that the volume of the configuration 
space is K = (2n) n . Details of the energy functions are 
expected to be irrelevant for the algorithmic questions 
addressed here. Our test case will be the small brain 
peptide Met-Enkephalin, which features 24 dihedral an- 
gels as dynamical variables, see table I (the conventions 
follow Ref. [17], which differs from [12]). Besides the <p, ip 
angles, we keep also the to angles unconstrained, which 
are usually restricted to [it — tt/9, tt + tt/9}. This allows us 
to illustrate the RM idea for a particularly simple case. 

TABLE I. Acceptance rates for dihedral angle movements. 
They are accurate to about ±1 in the last digit. 



var 


angle 


residues 


400 K 


300 K 


300 K 








Metro 


Metro 


RMi 


vi 


x L 


Tyr-1 


0.107 


0.070 


0.272 


V2 


x 2 


Tyr-1 


0.182 


0.128 


0.343 


V-A 


n — 

x a 


Tyr-1 


0.497 


0.377 


0.680 


VA 




Tyr-1 


0.392 


0.340 


0.547 


V5 


1> 


Gly-2 


0.096 


0.044 


0.139 


V6 


LO 


Gly-2 


0.049 


0.034 


0.416 


V7 


<P 


Gly-2 


0.112 


0.045 


0.076 


v& 


</> 


Gly-3 


0.106 


0.038 


0.064 


Vg 


LO 


Gly-3 


0.041 


0.025 


0.301 


VlO 




Gly-3 


0.088 


0.035 


0.070 


Vn 


rl> 


Phe-4 


0.115 


0.040 


0.077 


V12 


LO 


Phe-4 


0.047 


0.030 


0.368 


V13 


x l 


Phe-4 


0.109 


0.086 


0.277 


Vl4 


x 2 


Phe-4 


0.192 


0.166 


0.403 


Vw 


<t> 


Phe-4 


0.082 


0.042 


0.139 


via 


rl> 


Met-5 


0.122 


0.063 


0.156 


vir 


to 


Met-5 


0.062 


0.047 


0.573 


Vis 


x L 


Met-5 


0.117 


0.092 


0.362 


Vig 


2 

X 


Met-5 


0.159 


0.121 


0.585 


V20 


X s 


Met-5 


0.269 


0.211 


0.709 


V21 


x 4 


Met-5 


0.455 


0.385 


0.833 


V22 


<t> 


Met-5 


0.129 


0.086 


0.258 


V23 


i> 


Met-5 


0.378 


0.267 


0.469 


V24 


LO 


Met-5 


0.114 


0.096 


0.873 


E 






0.168 


0.119 


0.375 



The Metropolis importance sampling would be per- 
fected, if we could propose new configurations {v[} with 
their canonical pd. This is not possible as no Metropo- 
lis simulation would be necessary if the canonical pd 
were known. But conventional Metropolis simulations 
work well at sufficiently high temperatures X" and can 
thus provide an estimate p(vi, . . . , v n ; T") of the pd 
p(vi, . . . , v n ;T'). Due to the funnel picture, we expect 
that such an estimate can be used to feed useful informa- 
tion into the simulation at a sufficiently close-by lower 



temperature T < T' [1]. The idea of the RM scheme 
is to propose a transition from a configuration {vi} to a 
new configuration {«■} with the pd p( w i> . . . , v' n ;T') and 
to accept it with the probability 



P„. = min 



exp (-/?£') p( Vl ,...,v n ;T') 



exp (-/?£) p(v' 1 ,...,v' n ;T') 



(1) 



where (i — l/(kT). This equation biases the a-priori 
probability of each dihedral angle with an estimate of its 
pd from a higher temperature. Arbitrary gneralized en- 
sembles can be treated similarly by replacing exp(— (i E') 
and exp(— [3 E) in Eq. (1) by the appropriate probabili- 
ties Pg(E') and P g {E) of the generalized ensemble. 
For a range of temperatures 



Ti > T 2 > 



> T r > 



> T f -! > T f (2) 



the simulation at the highest temperature, Xi, is per- 
formed with the usual Metropolis algorithm and the re- 
sults are used as input for the simulation at T%. The 
estimated pd p(v\, . . . , v n ; T r _i) is expected to be a use- 
ful approximation of p(v\, . . . , v n ; T r ), therefore alllow- 
ing the scheme to zoom in on the native structure that 
is dominant at the physically relevant final temperature 

To get things started, we need to construct an estima- 
tor ~p{v\, . . . , v n ; T r ) from the numerical data of the RM 
simulation at temperature T r . Although this is neither 
simple nor straightforward, a variety of approaches offer 
themselves to define and refine the desired estimators. 

In Ref. [1] the approximation 



p(v!, . . .,v n ;T r ) 



}Jpl(v t ;T r ) 



(3) 



was investigated, where p\(vi\T r ) are estimators of re- 
duced one-variable pds defined by 



Pi{vi-,T) 



/+7T ^ 
\\_dvj p{v-i,...,v n ;T) . 



(4) 



The resulting algorithm, called RMi, constitutes the sim- 
plest RM scheme possible. 

Let us fill in the details of the RMi implementation [1]. 
To update with the RMi weights it is convenient to rely 
on the cumulative distribution functions defined by 



Fi(v) = f dv'p\(v') . 

J —IT 



(5) 



The estimate of i*io, the cumulative distribution function 
for the dihedral angle Gly-3 («io), from the vacuum 
simulations at our highest temperature, T\ = 400 K, is 
shown in Fig. 1 (this is the same angle for which his- 
tograms at 400 K and 300 K are shown in Ref. [1]). For 
our plots in the present paper we use degrees, while we 
use radiant in our theoretical discussions and in the com- 
puter programs. Fig. 1 is obtained by sorting all na a t 
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FIG. 1. Estimate of the cumulative distribution function 
for the Met-Enkephalin dihedral angle v 10 (Gly-3 4>) at 400 K. 



values of v w in our time series in ascending order and 
increasing the values of Fiq by 1/ndat whenever a mea- 
sured value of vio is encountered. Using a heapsort ap- 
proach, the sorting is done in ndat log 2 («dat) steps (see, 
e.g., Rcf. [15]). 

Next we divide the ordinate between and 1 into n ta b 
equal segments. The value of ntab has to be small enough 
that a table of size n x n ta b fits conveniently into the 
computer RAM. For each integer j = 1 , . . . , n tab the 
value Fij = j /ntab defines a unique value Vij through 
Fij = Fi(vij) as is indicated in the figure (for which 
i = 10). Furthermore, for each choice of a dihedral angle 
(i.e., a particular value of i) we define the differences 



Vi. 



Vi. 



_i with V{ o = — 7r 



(6) 



The grid in Fig. 1 shows the discretization for the variable 
i>io and the choice ntab = 16. While the discretization 
for F\o on the ordinate is uniformly spaced, widely vary- 
ing intervals are obtained for v w on the abscissa. The 
Metropolis procedure for one update of a dihedral angle 
Vi is now specified as follows: 

1. Place the present angle Vi on the discretization grid, 
i.e., find the integer j through the relation Vij-i < 



Vi < v 



1,3 



For one-variable updates j is available in 
the computer memory if it is stored at the previous 
update. Otherwise, j can be re-calculated in n 2 
steps for the choice n ta b = 2" 2 [9]. 

2. Pick an integer j' uniformly distributed in the range 

1 to n t ab- 



+ x r Avij', where < x r < 1 



3. Propose v[ — v^, 
is a uniformly distributed random number. 

4. Accept v[ with the probability 



exp(-j3E') Av t 



' exp(-PE)Av 



1,3 



(7) 



It is through the widely varying ratios Aviji/Avij 
that importance sampling for the rugged variables be- 
comes improved. Back to our illustration in Fig. 1: The 
short and the long interval on the abscissa are proposed 
with equal probabilities, i.e., the a-priori probability den- 
sity for our angle is high in short intervals and low in 
long intervals. The CPU time consumption of the RMi 
scheme is practically identical with that of the conven- 
tional Metropolis algorithms, because the bulk of the 
CPU time is spend on the calculation of the new energy 
E'. 



A. Numerical results 

The performance of the RMi algorithm is tested at 
300 K using input from a simulation at 400 K. The tem- 
perature of 400 K is high enough so that the conventional 
Metropolis algorithm is efficient, while it is low enough to 
provide useful input for the simulation at 300 K, a tem- 
perature at which one experiences a considerable slowing 
down in a conventional Metropolis simulation of Met- 
Enkefalin. 

Our Metropolis simulations are performed with a vari- 
ant of SMMP (Simple Molecular Mechanics for Pro- 
teins) [17]. For each simulation a time series of 2 17 = 
131, 072 configurations is kept, sampling every 32 sweeps. 
A sweep is defined by updating each dihedral angle once, 
which we do in the sequential order of the angles listed in 
table I. Usually sequential updating is more efficient than 
random updating [15]. Before starting with the measure- 
ments, 2 18 = 262, 144 sweeps are performed for reaching 
equilibrium. Thus, the entire simulation at one tempera- 
ture uses 2 18 + 32 ■ 2 17 = 4, 456, 448 sweeps. On a 1.9 GHz 
Athlon PC this takes under 12 hours. For each dihedral 
angle the acceptance rate of the Metropolis algorithm 
was monitored at run time and, following the recipes of 
[15], the integrated autocorrelation time r; n t is calculated 
from the recorded time series. 

Acceptance rates for dihedral angle movements are 
compiled in table I. For the energy entry it is the ra- 
tio of all accepted over all proposed moves. Results are 
given for simulations with the conventional Metropolis al- 
gorithm at 400 K and 300 K, and for the RMi simulations 
at 300 K. The RMi updating uses a discretization with 
"tab = 2 7 = 128 from the 400 K Metropolis data. Ac- 
ceptance rates greater than 0.3 are desirable [15]. From 
the table we notice that the acceptance rates vary greatly 
from angle to angle. For the Metropolis simulation the 
values arc in the interval [0.041,0.497] at 400 K and in 
[0.025,0.387] at 300 K. For both temperatures v g corre- 
sponds to the lowest value, while v^ and V21 correspond 
to the highest values. 
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FIG. 2. Estimate of the cumulative distribution function 
for the Met-Enkephalin dihedral angle Vg (Gly-2 tu) at 400 K. 



Our RMi updating at 300 K increases the acceptance 
rate for each angle, often even beyond the Metropolis 
acceptance rate at 400 K, as is obvious from the average 
value listed for the energy. A second look reveals that the 
increase in the acceptance rate varies greatly from angle 
to angle. While for some angles the problem of low accep- 
tance rates is entirely solved, for others the improvement 
remains modest. For instance for all u angles the increase 
is dramatic, e.g., from 0.034 to 0.416 for v 6 . Angles with 
little improvements are v? (0.045 — > 0.076), vg (0.038 — > 
0.064), v w (0.035 -» 0.070), and v n (0.040 -» 0.077). 
Better, but still not particularly impressive, is the in- 
crease in the acceptance rates of v&, wis and viq. All 
these are <j>, tp angles around C a atoms. For all other an- 
gles RMi updating has moved the acceptance rate above 
or at least close to 0.3. 

The improvement for u> angles is most easily under- 
stood. Figure 2 shows the cumulative distribution func- 
tion for vg (Gly-2 u) at 400 if, which is the angle of 
lowest acceptance rate in the conventional Metropolis up- 
dating. This distribution function corresponds to a his- 
togram narrowly peaked around ±7r, which is explained 
by the specific electronic hybridization of the CO-N pep- 
tide bond. From the grid shown in Fig. 2 it is seen that 
the RMi updating concentrates the proposal for this an- 
gle in the range slightly above — tt and slightly below +tt. 
Thus the procedure has a similar effect as the often used 
restriction to the range [tt— it/9, tt+tt/9], which is also the 
default implementation in SMMP (the range [tt, tt + tt/9] 
is, of course, [—tt, —tt + tt/9] in our plots). 

Although acceptance rates give some insights, the de- 
cisive quantity for the performance of an algorithm is 
the more difficult to calculate integrated autocorrelation 
time T; nt . To achieve a pre-defined accuracy, the com- 
puter time needed is directly proportional to Ti nt . 

In table II the integrated autocorrelation times are 



TABLE II. Integrated autocorrelation times for dihedral 
angle movements in units of 32 sweeps. 



var 


400 K 


300 K 


300 K 


300 K 




Metro 


Metro 


RMi 


RM 2 


Vl 


2.11 (06) 


15.2 (1.5) 


9.07 (58) 


6.03 (47) 


V2 


1.18 (02) 


2.70 (16) 


1.63 (09) 


1.70 (12) 


V3 


1.03 (01) 


2.18 (14) 


1.26 (04) 


1.24 (04) 


t>4 


1.44 (03) 


4.44 (23) 


3.28 (21) 


2.82 (14) 


V5 


5.44 (20) 


54.5 (5.4) 


26.3 (1.5) 


20.0 (1.3) 


V6 


2.95 (07) 


23.3 (2.7) 


8.65 (58) 


6.00 (34) 


V 7 


5.83 (29) 


103 (14) 


52.9 (4.3) 


24.3 (1.3) 


V8 


7.36 (22) 


125 (12) 


74.2 (6.9) 


35.0 (2.7) 


Vg 


4.39 (13) 


32.0 (2.2) 


14.2 (1.0) 


8.84 (48) 


VlO 


9.08 (88) 


124 (12) 


80.6 (6.9) 


34.3 (2.8) 


Vn 


5.39 (45) 


105 (08) 


72.4 (5.5) 


31.3 (1.9) 


V12 


3.37 (08) 


15.6 (1.5) 


5.68 (39) 


3.92 (17) 


V13 


1.81 (05) 


8.79 (46) 


5.69 (54) 


3.59 (22) 


V14 


1.15 (02) 


1.65 (10) 


1.40 (07) 


1.26 (06) 


V15 


6.72 (28) 


105 (12) 


45.6 (2.7) 


27.5 (4.5) 


Vl6 


9.28 (28) 


133 (09) 


75.2 (5.2) 


33.9 (2.1) 


vn 


1.90 (04) 


9.69 (79) 


3.89 (36) 


2.29 (08) 


«18 


1.66 (05) 


12.0 (1.6) 


6.48 (78) 


5.11 (28) 


Vig 


1.1 / yyJZ j 


1 fi^ fflft^ 

l.OO l^UoJ 


1 1 (n^ 

1.1D ^UoJ 


117 

1.1 f yUo ) 


V20 


1.02 (01) 


1.08 (02) 


1.03 (02) 


1.02 (02) 


V21 


1.00 (01) 


1.00 (01) 


1.00 (01) 


1.02 (01) 


V22 


3.20 (12) 


35.9 (4.0) 


18.2 (1.2) 


12.0 (0.8) 


V23 


1.50 (04) 


20.3 (1.8) 


11.0 (0.6) 


5.96 (35) 


«24 


1.07 (02) 


1.22 (05) 


1.00 (01) 


1.00 (01) 


E 


4.89 (21) 


50.7 (5.0) 


26.0 (1.4) 


14.2 (0.7) 



compiled for all angles and for the energy. The values 
are statistically consistent with those of Ref. [1]. Devi- 
ations are due to re-runs and using different procedures 
for estimating integrated autocorrelation times. In all 
tables they are given in units of 32 sweeps, as this is the 
step-size of our MC time series. Error bars are shown in 
parenthesis. For these calculations we use the routines 
of Ref. [15] together with a jackknife error analysis as 
explained there. The angles V7, vg, v\o, ^11,^15 and v\6 
exhibit autocorrelation times > 100 in the conventional 
Metropolis simulation at 300 K. Note that four of these 
are those with the worst improvement of acceptance rates 
when moving to the RMi updating, while the remaining 
two belong to the subsequent group with still rather poor 
improvement. 

The increase in magnitude in the autocorrelation times 
for these six angles is remarkable when the temperature 
of the conventional Metropolis simulation is lowered from 
400 K to 300 K. This shows that the standard Metropolis 
algorithm is efficient at 400 K but not so at 300 K. On 
the other hand, the distribution of the variables is not 
dramatically changed, at least to the extent that this can 
be judged from one-variable histograms, as is illustrated 
in Ref. [1] for vio. This is the reason why the 400 K 
simulation provides useful input for the RMi simulation 
at 300 K. 



The RMi updating reduces the integrated autocorre- 
lation times at 300 K by factors of about two, for in- 
stance for vr from 103 to 53. The Tint values vary greatly 
from angle to angle. While some angles show no au- 
tocorrelations after 32 sweeps (r; n t = 1 or close to it), 
the largest value on record for RMi updating at 300 K is 
Tint = 80±7 for vio (down from 124 for Metropolis updat- 
ing at 300 K). That the RMi updating does not reduce 
the large autocorrelation times more efficiently has ob- 
viously to do with correlations between different angles. 
Notably even moves of some of the u angles, like vg with 
Tint = 14.2 ± 1.0, appear considerably correlated with the 
rest of the molecule. RM variants which move several dy- 
namical variables collectively are required and our RM2 
implementation for simultaneous updates of two dihedral 
angles is discussed in section III. First let us address a 
multi-hit Metropolis procedure. 

B. Multi-hit updating 

Our sequential updating hits each angle once. The 
greatly varying integrated autocorrelation times of ta- 
ble II suggest that the computer time may be more ef- 
ficiently used by performing several Metropolis hits for 
variables with large integrated autocorrelation times, to 
be called "bad" variables in the following. 

To find an optimal choice for the number of hits per 
variable requires some thought. At 300 K the integrated 
autocorrelation times of the dihedral angles vary between 
Tint = 1 and r int w 133 for the conventional Metropolis 
updating and still between n nt = 1 and T; n t ~ 80 for the 
RMi algorithm. It is certainly not a good idea to choose 
the number of hits per variable in proportion to Tint, be- 
cause we expect correlations between angles to be the 
main obstacle for reducing large integrated autocorrela- 
tion times. A scheme with a large number of hits mimics 
the heat-bath algorithm (e.g., Rcf. [15]), which sets the 
upper bound to the gain in performance, but does not 
resolve the problem of correlations between angles. So a 
modest increase in the number of hits per bad variable 
may increase the performance of the updating, while a 
further increase will result in the contrary. 

A guideline for choosing the number of hits is obtained 
from the observation that the previously obtained accep- 
tance rates per update attempt do not change when per- 
forming multiple hits. It appears reasonable to increase 
the hits of bad variables while bounding the number of 
hits times the acceptance rate by 0.3 from above. As 
the acceptance rates change considerably when switch- 
ing from regular Metropolis to RMi updating, we employ 
different schemes for the two cases. Results for the two 
different multi-hit schemes are collected in table III. 

The numbers in the first "hits" column are used for the 
regular Metropolis updating. They are arranged to add 
up to 48, i.e., twice the total number of variables. The 
additional computer time needed is balanced by reduc- 



TABLE III. Multi-hit performance. 



var 


48 


400 K 


300 K 


39 


300 K 


300 K 




hits 


Metro 


Metro 


hits 


RMi 


RM 2 


Vl 


2 


2.05 (07) 


14.2 (1.1) 


1 


7.29 (52) 


5.66 (42) 


V2 


1 


1.37 (03) 


3.29 (23) 


1 


1.56 (04) 


1.91 (05) 


V3 


1 


1.04 (04) 


2.15 (10) 


1 


1.39 (04) 


1.51 (06) 


V4 


1 


1.47 (03) 


5.49 (57) 


1 


2.74 (12) 


3.09 (16) 


vs 


3 


3.73 (08) 


48.1 (5.7) 


2 


21.8 (1.7) 


20.3 (1.0) 


V6 


1 


4.19 (09) 


19.3 (1.2) 


1 


7.12 (35) 


5.11 (22) 


V7 


4 


3.96 (21) 


61.7 (3.5) 


4 


42.3 (2.9) 


20.7 (1.2) 


Vs 


4 


5.06 (19) 


81.8 (5.2) 


4 


50.5 (4.1) 


24.1 (1.2) 


V9 


2 


4.21 (13) 


25.0 (1.6) 


1 


12.7 (1.0) 


8.14 (47) 


Vio 


4 


5.28 (20) 


86.1 (6.4) 


4 


48.0 (4.4) 


24.7 (1.6) 


Vll 


4 


3.72 (14) 


81.1 (7.2) 


4 


53.7 (4.8) 


25.5 (1.4) 


V12 


2 


3.04 (13) 


11.3 (0.6) 


1 


4.82 (42) 


4.61 (26) 


V13 


1 


2.39 (05) 


9.36 (96) 


1 


4.36 (29) 


3.45 (26) 


via 


1 


1.34 (03) 


2.03 (15) 


1 


1.23 (03) 


1.28 (03) 


«15 


4 


4.65 (16) 


61.1 (4.4) 


2 


35.5 (3.2) 


20.3 (1.0) 


«16 


4 


6.29 (20) 


86.7 (8.5) 


2 


61.5 (5.0) 


29.5 (1.9) 


vn 


1 


2.86 (06) 


7.48 (42) 


1 


3.16 (30) 


2.18 (07) 


via 


1 


2.03 (05) 


11.7 (1.1) 


1 


7.79 (91) 


6.48 (51) 


«19 


1 


1.64 (04) 


2.57 (15) 


1 


1.16 (02) 


1.32 (03) 


V20 


1 
1 


1 no /'ni \ 


l.Zl {uZ) 


1 
1 


1.U1 {k)L) 




V21 


1 


1.00 (01) 


1.02 (02) 


1 


1.00 (01) 


1.00 (01) 


V22 


2 


2.98 (08) 


26.2 (1.8) 


1 


19.2 (1.6) 


13.9 (0.9) 


V23 


1 


1.51 (05) 


13.3 (1.1) 


1 


11.4 (1.0) 


5.60 (31) 


«24 


1 


1.44 (03) 


1.63 (04) 


1 


1.02 (01) 


1.03 (02) 


E 




4.25 (17) 


32.9 (1.4) 




24.9 (2.2) 


14.7 (1.3) 



ing the number of sweeps between measurements from 32 
to 16 (a sweep is now defined by applying the new up- 
dating procedure in sequential order once to each angle) . 
By comparing tables II and III we see that the multi- 
hit updating improves the Metropolis algorithm at 300 K 
considerably: the integrated autocorrelation time for the 
energy is down by about 40%. 

The numbers in the second "hits" column are used for 
RMi and RM 2 updating. As RMi updating increases 
acceptance rates already significantly, there is little op- 
portunity for additional improvements due to multiple 
hits. By that reason the numbers of the column add only 
up to 39 hits per sweep. This is balanced by reducing the 
number of sweeps between measurements from 32 to 20 
(the integer nearest to 32 x 24/39). There are still sig- 
nificant decreases in autocorrelations times for the bad 
variables, but the indicator for overall performance, the 
integrated autocorrelation time of the energy, shows only 
a modest 5% decrease when comparing to RMi without 
multiple hits and practially no change for RM 2 updat- 
ing, introduced next. The apparent reason is that these 
updating schemes are already much closer to a heat-bath 
scenario, so that the improvement due to multiple hits 
becomes offset by the additional computer time needed. 
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III. THE RM 2 APPROXIMATION 

We now generalize the RMi scheme of Eq. (7) to the 
simultaneous updating of two dihedral angles. For i\ ^ i 2 
the reduced two-variable pds are defined by 

/+TT 
II dvjp{vj,...,v n ;T) . (8) 

The one-variable cumulative distribution functions F^ 
and the discretization v^j, j = 0, . . . , n ta b are already 
given by Eqs. (5) and (6). We define conditional cumu- 
lative distribution functions by 

F ii,i2-,j(v) = dv i2 dv h p iui2 (v il ,v i2 ) (9) 

J—ir Jviij-i 

for which the normalization -Fii ,j 2 ;j ( 7r ) = V^tab holds. 
To extend the RMi updating to two variables we de- 
fine for each integer k = 1, . . . , ntab the value i ? i 1 ,» 2 ;j,fc = 
fc/(n ta b) 2 - Next we define Vi lt i 2 -j t k through F^^-^k = 
Fii,i2:j( v h,i2;j-k) an d also the differences 

AVij.ijy,* = V*i,i 2 y,fc - V*i,»2;j,fc-1 With U»i,» 2 ;j,0 = _ 7T • 

(10) 

The RM2 Metropolis procedure for the simultaneous up- 
date of (t>ij , «i 2 ) is then specified as follows: 

1. Find the grid index j for the present angle Vi ± 
through Vi lt j-i < v il < v^j, just like for RMi 
updating. 

2. Find the grid index k for the present angle Vi 2 
through Vi lii2 -j t k-l < Vi 2 < ^ii,i 2 ;j,fe- 

3. Pick two integers j' and k', each uniformly dis- 
tributed in the range 1 to ntab- (This could be 
extended to cover asymmetric ranges n\ ah x n^ ab .) 

4. Propose v' it = v^ji-i +x\ Av^ji, where < x\ < 
1 is a uniformly distributed random number. 

5. Propose v[ 2 = v^^-j^k'-i + x T 2 Av^^-ji^ 1 , where 
< x\~ < 1 is a second uniformly distributed ran- 
dom number. 

6. Accept (v'i jv'i ) with the probability 
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FIG. 3. Areas of equal probabilities (sorting 117 then v$). 
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FIG. 4. Areas of equal probabilities (sorting vs then vt). 



largest integrated autocorrelation times of the RMi pro- 
cedure and are expected to be strongly correlated with 
one another because they are pairs of dihedral angles 
around a C a atom. 

The bias of the acceptance probability given in Eq. (11) 
is governed by the areas 



Aft,.; Aw 



exp(-f3E') Av iuj > Av ilti2;j > t k' 



exp(-PE) Av iltj Av iu i 2 -j yk 



(11) 



As before, estimates of the conditional cumulative dis- 
tribution functions and the intervals Av^.^-j^ are ob- 
tained from the conventional Metropolis simulation at 
400 K. In the following we focus on the pairs (v-?,vs), 
{vio,Vu) and (vi 5 ,vi & ). These angles correspond to the 



For i\ — 6 and i 2 — 7 our 400 K estimates of these areas 
are depicted in Fig. 3. For the RM2 procedure these 
areas take the role which the intervals on the abscissa 
of Fig. 1 play for RMi updating. The small and the 
large areas are proposed with equal probabilities, so the 
a-priori probability for our two angles is high in a small 
area and low in a large area. In Fig. 3 the largest area is 
503.4 times the smallest area. Areas of high probability 
correspond to allowed regions in the Ramachandran map 
of a Gly residue [18]. 
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FIG. 5. Areas of equal probabilities (sorting vio then vu). 
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FIG. 6. Areas of equal probabilities (sorting vis then vw). 



Note that the order of the angles matters. The differ- 
ence between Fig. 3 and Fig. 4 is that we plot in Fig. 3 
the areas Ar^;j,k and in Fig. 4 the areas Ag^-j^ while 
the labeling of the axes is identical. This means that for 
Fig. 3 sorting is first done on the angle v~j (regardless of 
the value of vg) and then done on vg for which the corre- 
sponding value of vj is within a particular bin A«7, but 
for Fig. 4 it is first done one vg and then on vj. In Fig. 4 
the largest area is 396.4 times the smallest area. 

Fig. 5 and Fig. 6 give plots for the (wio,«ii) and 
(vi5,viq) pairs in which the angle with the smaller sub- 
script is sorted first. The ratio of the largest area over 
the smallest area is 650.9 for (wio,«n) and 2565.8 for 
(vi5,vi§). The large number in the latter case is related 
to the fact that (vi 5 , viq) is the pair of <j>, tp angles around 
the C a atom of Phe-4, for which positive cj> values are dis- 
allowed [18]. 

The RM 2 scheme which we have tested adds updates 
for the three pairs (v-?,vs), (wio, w n) and (vi 5 ,viq) after 



one-angle updates for all the 24 angles with the RMi 
scheme. For each pair both orders of sorting are used, so 
that we add altogether six new updates. The bookkeep- 
ing for this process is a bit tricky, because an accepted 
update changes not only (j, k) — > (j 1 , k'), but also the the 
j from the RMi updating of the angles. The latter corre- 
sponds to a different table and needs to be re-calculated 
from the new value of the angle. As already mentioned, 
this can be done in log 2 (n ta b) steps [9]. Similarly, ac- 
cepted RMi updates can change the initial RM 2 (j, k) 
values, so that they may have to be re-calculated. The 
six RM 2 update tables, each of size 16 x 16, are built 
from the 400 K Metropolis simulation, and the areas of 
four of them are precisely those shown in Figs. 3 to 6. 



A. Numerical Results 

We have checked the correctness of our updating pro- 
cedure by comparing high precision energy averages and 
other observables with results from previous calculations. 
The acceptance rates of the one- variable updates remain 
the same as they were for RMi procedure. For the ac- 
ceptance rate of a pair we average over the two cases. 
Table IV compares the two-angle RM 2 acceptance rates 
at 300 K to those obtained by proposing the same two- 
angle updates with the standard Metropolis procedure. 
At 300 K an increase by factors in the range from three 
to nearly ten is found. However, the values remain sur- 
prisingly low, presumably due to substantial correlations 
with additional angles. 

TABLE IV. Acceptance rates for simultaneous moves of 
angle pairs. 



variable pair 


400 K 


300 K 


300 K 


300 K 




Metro 


Metro 


RM 2 


RM 2 








"tab = 16 


n tab = 128 


(«7, Vg,) 


0.044 


0.0060 


0.019 


0.020 


(vio,tm) 


0.041 


0.0051 


0.021 


0.022 


(visjfie) 


0.018 


0.0051 


0.048 


0.050 



Integrated autocorrelation times are calculated to eval- 
uate the improvement of the overall performance. For 
this purpose the number of sweeps between measure- 
ments is reduced from 32 to 26 to account for the ad- 
ditional CPU time needed for the two-angle moves. The 
results are presented in table II. Despite the small ac- 
ceptance rates for the two-angle moves, the integrated 
autocorrelation times for the targeted angles are substan- 
tially reduced. For all the six angles they are smaller by 
factors larger than two when compared with the RMi re- 
sults. Interestingly this speed-up propagates through the 
entire system and the integrated autocorrelation time for 
the energy is found to be about a factor of two smaller 
than for the RMi algorithm. 



7 



Multi-hit updates allow us to focus even more on the 
angles with large autocorrelation times. For the one- 
angle updates we use the same numbers of hits as for 
RMi updating (see table III). In addition we use four 
hits for the pairs (v?,vg) and (wio,wn) and two hits for 
the {vi5,vi 6 ) pair. For each each pair both orders of the 
updating are used. Altogether we perform 39 one-angle 
and 20 two-angle hits per sweep, which is balanced by re- 
ducing the number of sweeps between measurements to 
13 (the integer nearest to 32 x 24/59). The results for in- 
tegrated autocorrelation times are a mixed bag (compare 
the last columns of tables II and III). While the values 
for the targetted angles indeed go down, in particular for 
v&, the improvement does not propagate to the energy. 



TABLE V. Multi-hit Metropolis at low temperatures. 



var 


48 


280 K 


260 K 


240 K 


220 K 




hits 










Vl 


2 


24.5 (1.7) 


59.8 (4.8) 


246 (18) 


658 (60) 


V5 


3 


70.7 (5.0) 


236 (27) 


531 (47) 


1672 (148) 


ve> 


1 


34.8 (2.4) 


65.0 (4.8) 


185 (10) 


453 (28) 


t>7 


4 


156 (16) 


526 (88) 


1425 (118) 


4434 (405) 


v& 


4 


191 (17) 


627 (78) 


1591 (135) 


4965 (485) 


V9 


2 


60.2 (5.0) 


114 (06) 


467 (33) 


1809 (219) 


Vio 


4 


186 (15) 


613 (69) 


1530 (131) 


5326 (443) 


Vn 


4 


214 (18) 


625 (82) 


1901 (148) 


5781 (437) 


Vl2 


2 


20.1 (1.1) 


44.3 (3.2) 


158 (08) 


557 (44) 


V13 


1 


15.7 (05) 


36.8 (2.9) 


119 (07) 


279 (17) 


Vl 5 


4 


239 (90) 


308 (31) 


999 (78) 


2283 (170) 


Vl6 


4 


204 (12) 


449 (34) 


1600 (129) 


4537 (344) 


vir 


1 


11.9 (0.7) 


22.8 (1.2) 


78.8 (2.7) 


278 (15) 


vis 


1 


24.1 (2.6) 


55.5 (3.6) 


208 (11) 


755 (46) 


V22 


2 


64.5 (4.7) 


136 (10) 


343 (16) 


898 (54) 


«23 


1 


37.2 (1.8) 


84.4 (5.1) 


360 (26) 


830 (63) 


E 




71.6 (4.0) 


148 (08) 


484 (29) 


1536 (121) 



In tables V and VI we present results down to 220 K 
for the multi-hit improved Metropolis and RM2 algo- 
rithms, where the distribution of hits is the same as listed 
in table III. We always construct the RM 2 table from 
the previous RM2 runs at the next higher temperature. 
We also generate RM 2 data without using the multi-hit 
scheme, with the resulting autocorrelation times consis- 
tcnly higher than those reported in table VI. We have 
not listed the angles V2, V3, V4, V14, i>ig, 1*20, V21 and V24 
in tables V and VI, because they are not significantly 
correlated with the rest of the molecule. 

When lowering the temperature towards 220 K, the au- 
tocorrelation times increase rapidly. To control Tint we 
double the number of sweeps between measurements each 
time, when decreasing the temperature. So it is 64 at 
280 K, 128 at 260 K, 256 at 240 K and 512 at 220 K. For 
the multi-hit Metropolis simulations this was still not suf- 
ficient and we performed two additional runs, with 4 x 256 
sweeps at 240 K and with 4 x 512 sweeps at 220 K. This 



TABLE VI. RM 2 multi-hit at low temperatures. 



var 


39 


280 K 


260 K 


240 K 


220 K 




hits 










Vl 


1 


11.6 (0.7) 


27.5 (1.2) 


81.6 (6.2) 


188 (017) 


V5 


2 


36.9 (1.8) 


89.5 (7.4) 


256 (26) 


824 (103) 


V6 


1 


10.7 (0.6) 


20.1 (0.7) 


50.4 (2.4) 


146 (017) 


V7 


4 


45.0 (3.1) 


105 (05) 


359 (27) 


1434 (164) 


t>8 


4 


63.0 (4.0) 


131 (07) 


459 (42) 


2266 (309) 


V9 


1 


17.5 (0.8) 


44.9 (1.9) 


130 (10) 


405 (030) 


VIO 


4 


59.9 (4.1) 


134 (09) 


514 (57) 


2344 (300) 


Vn 


4 


57.1 (3.6) 


157 (10) 


495 (35) 


1965 (178) 


V12 


1 


8.04 (30) 


20.1 (1.0) 


52.9 (3.4) 


131 (008) 


V13 


1 


6.55 (25) 


12.9 (0.7) 


39.0 (3.5) 


86.7 (6.8) 


Vl5 


2 


43.5 (6.4) 


88.8 (4.9) 


265 (18) 


775 (082) 


via 


O 
A 


OO.i \*-o) 


14U \uo) 


/ion fQO^ 
4zU yoZ ) 


1 ccjq /1 a a \ 
lODy ^104j 


vn 


1 


4.43 (20) 


9.72 (44) 


30.7 (1.9) 


69.2 (4.4) 


«18 


1 


10.7 (0.5) 


26.5 (1.2) 


65.6 (2.6) 


206 (12) 


V22 


1 


25.4 (1.3) 


53.6 (2.7) 


124 (05) 


281 (014) 


V23 


1 


11.6 (0.6) 


28.2 (1.7) 


88.2 (5.2) 


267 (022) 


E 




22.9 (1.1) 


47.9 (1.8) 


128 (09) 


426 (038) 



explains the relatively small error bars in the last two 
columns of table V. In the tables we continue to re- 
port Ti n t in units of 32 sweeps, multiplying the measured 
Ti n t/32 value with the number of sweeps between mea- 
surements. It is seen that even for the RM 2 simulation 
the Tint increases are not compensated by the increase 
of computer time. In contrast to that, the decrease in 
acceptance rates is rather moderate, less than a factor 
of two when the temperature is lowered from 300 K to 
220 K. 

The results of tables V and VI show that our RM2 
sampling accelerates the conventional Metropolis simu- 
lations by a rather temperature independent factor. As 
we can assume that the multi-hit Metropolis simulations 
already improve conventional Metropolis simulations by 
about 40%, the RM2 accelaration is by a factor between 
four and five with respect to a conventional Metropolis 
simulation. For large scale simulations factors larger than 
two are clearly of importance, but it remains a bit puz- 
zling why the improvement does not increase upon lower- 
ing the temperature, as it is found when using generalized 
ensembles. Apparently coordinated moves of three and 
more angles are needed. 

IV. SUMMARY AND CONCLUSIONS 

We have reviewed the one- variable approximation RMi 
of the rugged Metropolis (RM) scheme of Ref. [1] and 
worked out a two- variable approximation RM2 for simul- 
taneous moves of two dihedral angles. As before the test 
system has been Met-Enkcphalin. A gain of a factor of 
four over conventional Metropolis simulations has been 
demonstrated at 300 K. 
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Although the elaboration of the RM scheme seems to 
be on track, much work is left to be done. Even for a 
system as simple as Met-Enkcphalin it remains unclear 
which kinds of correlations are responsible for the still 
low acceptance rates of the two-angles moves. On the 
other hand it is encouraging to sec that the autocorrela- 
tions times of these angles are nevertheless substantially 
reduced and that this effect propagates through the en- 
tire system. Other test cases need to be investigated to 
get a broader understanding of the observed features. In 
particular one would like to know how the performance 
gain depends on the system size. 

Somewhat puzzling is the lack of enhanced improve- 
ments at lower temperatures. The real future of biased 
updating procedures may lie in their implementation for 
generalized ensembles. 

Presently the leading method for simulations of 
biomolecules is molecular dynamics (see [19] for a text- 
book). This is to some extent surprising, because Markov 
Chain Monte Carlo (MC) simulations allow for large 
changes of conformations in a single move, so thermo- 
dynamically relevant equilibrium configurations can, in 
principle, be reached quickly. However, in simulations of 
biomolecules with an explicit inclusion of solvent inter- 
actions, large MC moves face the problem that there will 
not be a suitable cavity in the solvent to accommodate 
a large distortion of the molecule shape. While the RM 
method discussed in this paper decreases the likelihood 
of steric clashes in a vacuum simulation, it has no im- 
mediate translation into the situation of explicit solvent 
models. 

The way out may be the use of implicit solvent models, 
for which the change in the molecule-solvent and solvent- 
solvent interaction energies can be calculated instanta- 
neously, like in a vacuum simulation. Indeed RMi simu- 
lations for implicit solvent models, based on the solvent- 
accessible area method implemented in [17], have already 
been performed [20] . The algorithmic improvements were 
similar as found for the vacuum situation. However there 
is evidence [20,21] that the class of solvent models used 
does not parametrize the solvent interactions properly. 
It appears that quite generally the reliability of implicit 
solvent models has not yet been well established. 

Finally we like to mention that MC moves may be fine- 
tuned on a local level as done in the approach of Rcf. [22] . 
This is also possible for models which include solvents 
explicitly. So MC may still be a viable alternative to 
molecular dynamics for explicit solvent models. 
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